Imitation Learning with a Value-Based Prior
Authors
Abstract
The goal of imitation learning is for an apprentice to learn how to behave in a stochastic environment by observing a mentor demonstrating the correct behavior. Accurate prior knowledge about the correct behavior can reduce the need for demonstrations from the mentor. We present a novel approach to encoding prior knowledge about the correct behavior, where we assume that this prior knowledge takes the form of a Markov Decision Process (MDP) that is used by the apprentice as a rough and imperfect model of the mentor’s behavior. Specifically, taking a Bayesian approach, we treat the value of a policy in this modeling MDP as the log prior probability of the policy. In other words, we assume a priori that the mentor’s behavior is likely to be a high-value policy in the modeling MDP, though quite possibly different from the optimal policy. We describe an efficient algorithm that, given a modeling MDP and a set of demonstrations by a mentor, provably converges to a stationary point of the log posterior of the mentor’s policy, where the posterior is computed with respect to the “value-based” prior. We also present empirical evidence that this prior does in fact speed learning of the mentor’s policy, and is an improvement in our experiments over similar previous methods.
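The core idea above can be sketched in a few lines of code. The following toy example is an illustration, not the paper's implementation: the MDP, the demonstrations, and all names (`policy_value`, `log_posterior`, etc.) are hypothetical. It treats the average state value of a stochastic policy in the modeling MDP as the log prior, and adds the log-likelihood of the demonstrated state-action pairs to form the (unnormalized) log posterior:

```python
import numpy as np

# Hypothetical toy modeling MDP (not from the paper): 2 states, 2 actions.
# P[a, s, s'] : transition probabilities; R[s, a] : rewards.
P = np.array([
    [[0.9, 0.1], [0.1, 0.9]],   # action 0
    [[0.1, 0.9], [0.9, 0.1]],   # action 1
])
R = np.array([[1.0, 0.0],       # in state 0, action 0 is rewarding
              [0.0, 1.0]])      # in state 1, action 1 is rewarding
GAMMA = 0.9

def policy_value(policy):
    """Average state value of a stochastic policy[s, a] in the modeling MDP,
    computed by solving the linear policy-evaluation equations."""
    P_pi = np.einsum('sa,ast->st', policy, P)   # induced state-to-state chain
    R_pi = np.einsum('sa,sa->s', policy, R)     # expected per-state reward
    V = np.linalg.solve(np.eye(len(R_pi)) - GAMMA * P_pi, R_pi)
    return V.mean()

def log_posterior(policy, demos):
    """Unnormalized log posterior of a policy: log-likelihood of the
    demonstrated (state, action) pairs plus the value-based log prior."""
    log_lik = sum(np.log(policy[s, a]) for s, a in demos)
    return log_lik + policy_value(policy)       # policy value acts as log prior

demos = [(0, 0), (1, 1), (0, 0)]                # mentor takes the good actions
good = np.array([[0.9, 0.1], [0.1, 0.9]])      # fits demos, high value in MDP
bad  = np.array([[0.1, 0.9], [0.9, 0.1]])      # contradicts demos, low value
assert log_posterior(good, demos) > log_posterior(bad, demos)
```

The paper's algorithm ascends this log posterior to a stationary point; the sketch only evaluates it, which is enough to see how the value term biases the posterior toward high-value policies even before many demonstrations arrive.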
Similar References
Robots Learn to Recognize Individuals from Imitative Encounters with People and Avatars.
Prior to language, human infants are prolific imitators. Developmental science grounds infant imitation in the neural coding of actions, and highlights the use of imitation for learning from and about people. Here, we used computational modeling and a robot implementation to explore the functional value of action imitation. We report 3 experiments using a mutual imitation task between robots, a...
A Bayesian Model of Imitation in Infants and Robots
Learning through imitation is a powerful and versatile method for acquiring new behaviors. In humans, a wide range of behaviors, from styles of social interaction to tool use, are passed from one generation to another through imitative learning. Although imitation evolved through Darwinian means, it achieves Lamarckian ends: it is a mechanism for the inheritance of acquired characteristics. Unl...
Observed Body Clustering for Imitation Based on Value System
In order to develop skills, actions, and behavior in a human symbiotic environment, a robot must learn something from behavior observation of predecessors or humans. Recently, robotic imitation methods based on many approaches have been proposed. We have proposed reinforcement learning based approaches for the imitation and investigated them under an assumption that an observer recognizes the b...
A Bayesian Approach to Imitation in Reinforcement Learning
In multiagent environments, forms of social learning such as teaching and imitation have been shown to aid the transfer of knowledge from experts to learners in reinforcement learning (RL). We recast the problem of imitation in a Bayesian framework. Our Bayesian imitation model allows a learner to smoothly pool prior knowledge, data obtained through interaction with the environment, and informa...
Imitation Learning in Relational Domains: A Functional-Gradient Boosting Approach
Imitation learning refers to the problem of learning how to behave by observing a teacher in action. We consider imitation learning in relational domains, in which there is a varying number of objects and relations among them. In prior work, simple relational policies are learned by viewing imitation learning as supervised learning of a function from states to actions. For propositional worlds,...